An efficient MapReduce-based parallel clustering algorithm for distributed traffic subarea division

Author: Li Yantao
Rong Zhuobo
Wang Binfeng
Xia Dawen
Zhang Zili
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving the traffic subarea division problem on a widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ a MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide the traffic subareas of Beijing based on real-world trajectory data sets generated by 12,000 taxis over a period of one month using the proposed approach. Experimental evaluation results indicate that, compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, greater accuracy, and better scalability and can effectively divide traffic subareas with big taxi trajectory data.
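
The map/reduce split described in the abstract above can be illustrated with a minimal single-machine sketch of one K-Means iteration. This is not Par3PKM itself: it assumes plain Euclidean distance and caller-supplied starting centroids rather than the paper's modified metric and initialization, it does not run on Hadoop, and the function names (`map_phase`, `reduce_phase`, `kmeans`) are hypothetical.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map step: emit (index of nearest centroid, point) for every point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, p


def reduce_phase(pairs, centroids):
    """Reduce step: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = []
    for i, old in enumerate(centroids):
        members = groups.get(i)
        if not members:
            # Assumption: an empty cluster simply keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Repeat map + reduce for a fixed number of rounds (no convergence test)."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids
```

In a real MapReduce job the grouping by key would be done by the framework's shuffle stage; here `reduce_phase` does it in memory to keep the sketch self-contained.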

Deakin Research Online

Directory of Open Access Journals

A MapReduce-based nearest neighbor approach for big-data-driven traffic flow prediction

Author: Li Huaqing
Li Yantao
Wang Binfeng
Xia Dawen
Zhang Zili
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system including two key modules, i.e., offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method, combining the current data observed in OPP and the classification results obtained from large-scale historical data in ODT, to generate traffic flow prediction in real time. The empirical study on real-world traffic flow big data using the leave-one-out cross validation method shows that TFPC significantly outperforms four state-of-the-art prediction approaches, i.e., autoregressive integrated moving average, Naïve Bayes, multilayer perceptron neural networks, and NN regression, in terms of accuracy, which improves by up to 90.07% in the best case, with an average mean absolute percent error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.

Deakin Research Online

Trust in Software Supply Chains: Blockchain-Enabled SBOM and the AIBOM Future

Author: Liu Yue
Lu Qinghua
Xia Boming
Xing Zhenchang
Zhang Dawen
Zhu Liming
Publication date: 10/07/2023

Software Bill of Materials (SBOM) serves as a critical pillar in ensuring software supply chain security by providing a detailed inventory of the components and dependencies integral to software development. However, challenges abound in the sharing of SBOMs, including potential data tampering, hesitation among software vendors to disclose comprehensive information, and bespoke requirements from software procurers or users. These obstacles have stifled widespread adoption and utilization of SBOMs, underscoring the need for a more secure and flexible mechanism for SBOM sharing. This study proposes a novel solution to these challenges by introducing a blockchain-empowered approach for SBOM sharing, leveraging verifiable credentials to allow for selective disclosure. This strategy not only heightens security but also offers flexibility. Furthermore, this paper broadens the remit of SBOM to encompass AI systems, thereby coining the term AI Bill of Materials (AIBOM). This extension is motivated by the rapid progression in AI technology and the escalating necessity to track the lineage and composition of AI software and systems. Particularly in the era of foundational models like large language models (LLMs), understanding their composition and dependencies becomes crucial. These models often serve as a base for further development, creating complex dependencies and paving the way for innovative AI applications. The evaluation of our solution indicates the feasibility and flexibility of the proposed SBOM sharing mechanism, positing a new solution for securing (AI) software supply chains
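
The idea of selective disclosure mentioned in the abstract above can be sketched with salted hash commitments. This is only an illustration under stated assumptions: the paper's actual verifiable-credential scheme and blockchain anchoring are not reproduced, signing of the digest set is omitted, and the helper names (`commit`, `disclose`, `verify`) and the SBOM field names are hypothetical.

```python
import hashlib
import json
import secrets


def _digest(name, value, salt):
    """SHA-256 over a JSON-encoded [salt, name, value] triple."""
    payload = json.dumps([salt, name, value]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def commit(fields):
    """Issuer side: one random salt and one digest per SBOM field.

    Assumption: the digest dict is what would be signed or anchored.
    """
    salts = {k: secrets.token_hex(16) for k in fields}
    digests = {k: _digest(k, v, salts[k]) for k, v in fields.items()}
    return salts, digests


def disclose(fields, salts, names):
    """Holder side: reveal only the chosen fields together with their salts."""
    return {k: (fields[k], salts[k]) for k in names}


def verify(disclosed, digests):
    """Verifier side: recompute each digest and compare with the commitment."""
    return all(_digest(k, v, s) == digests.get(k) for k, (v, s) in disclosed.items())


# Hypothetical SBOM entry; only the component name and version are revealed.
sbom = {"component": "libexample", "version": "1.2.3", "supplier": "Example Corp"}
salts, digests = commit(sbom)
shared = disclose(sbom, salts, ["component", "version"])
```

The undisclosed `supplier` field stays hidden because its salt is never shared, while any change to a disclosed value makes `verify` fail.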

arXiv.org e-Print Archive

Hydrological cycle and water resources in a changing world: A review

Author: Dawen Yang
Jun Xia
Yuting Yang
Publication venue: 'Elsevier BV'
Publication date: 01/06/2021

Water is the fundamental natural resource that supports life, ecosystems and human society. Thus studying the water cycle is important for sustainable development. In the context of global climate change, a better understanding of the water cycle is needed. This study summarises current research and highlights future directions of water science from four perspectives: (i) the water cycle; (ii) hydrologic processes; (iii) coupled natural-social water systems; and (iv) integrated watershed management. Emphasis should be placed on understanding the joint impacts of climate change and human activities on hydrological processes and water resources across temporal and spatial scales. Understanding the interactions between land and atmosphere is key to addressing this issue. Furthermore, systematic approaches should be developed for large basin studies. Areas for focused research include variations of cryosphere hydrological processes in upper alpine zones, and the effects of human activities on the water cycle and relevant biogeochemical processes in middle-lower reaches. Because the water cycle is naturally coupled with social characteristics across multiple scales, multi-process and multi-scale models are needed. Hydrological studies should use this new paradigm as part of water-food-energy frontier research. This will help to promote interdisciplinary study across natural and social sciences in accordance with the United Nations' sustainable development goals.

Directory of Open Access Journals

Complex statistical analysis of big data: implementation and application of Apriori and FP-Growth algorithm based on MapReduce

Author: Rong Zhuobo
Xia Dawen
Zhang Zili
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Deakin Research Online

A distributed spatial-temporal weighted model on MapReduce for short-term traffic flow forecasting

Author: Li Huaqing
Li Yantao
Wang Binfeng
Xia Dawen
Zhang Zili
Publication venue: 'Elsevier BV'
Publication date: 29/02/2016

Accurate and timely traffic flow prediction is crucial to proactive traffic management and control in data-driven intelligent transportation systems (D2ITS), which has attracted great research interest in the last few years. In this paper, we propose a Spatial-Temporal Weighted K-Nearest Neighbor model, named STW-KNN, in a general MapReduce framework of distributed modeling on a Hadoop platform, to enhance the accuracy and efficiency of short-term traffic flow forecasting. More specifically, STW-KNN considers the spatial-temporal correlation and weight of traffic flow with trend adjustment features, to optimize the search mechanisms containing state vector, proximity measure, prediction function, and K selection. Furthermore, STW-KNN is implemented on a widely adopted Hadoop distributed computing platform with the MapReduce parallel processing paradigm, for parallel prediction of traffic flow in real time. Finally, with extensive experiments on real-world big taxi trajectory data, STW-KNN is compared with the state-of-the-art prediction models including conventional K-Nearest Neighbor (KNN), Artificial Neural Networks (ANNs), Naïve Bayes (NB), Random Forest (RF), and C4.5. The results demonstrate that the proposed model is superior to existing models on accuracy by decreasing the mean absolute percentage error (MAPE) value more than 11.9% only in time domain and even achieves 89.71% accuracy improvement with the MAPEs of between 4% and 6.% in both space and time domains, and also significantly improves the efficiency and scalability of short-term traffic flow forecasting over existing approaches.
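
To make the terms in the abstract above concrete, here is a minimal sketch of a distance-weighted K-nearest-neighbor forecast together with the MAPE metric. It assumes inverse-distance weights over plain Euclidean state vectors, which is a generic stand-in and not the spatial-temporal weighting or trend adjustment of STW-KNN; `knn_forecast` and `mape` are hypothetical names.

```python
from math import dist


def knn_forecast(history, current, k=3):
    """Predict the next flow value from the k most similar past states.

    history: list of (state_vector, next_value) pairs.
    current: the state vector observed now.
    """
    nearest = sorted(history, key=lambda h: dist(h[0], current))[:k]
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (dist(state, current) + 1e-9) for state, _ in nearest]
    weighted_sum = sum(w * nxt for w, (_, nxt) in zip(weights, nearest))
    return weighted_sum / sum(weights)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

For example, `mape([100, 200], [90, 220])` averages two 10% errors, and `knn_forecast` with two equally distant neighbors returns the midpoint of their next values.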

Deakin Research Online

Alkyl Sulfoxides as Radical Precursors and Their Use in the Synthesis of Pyridine Derivatives

Author: Dawen Niu
Demeng Xie
Xia Zhang
Yingwei Wang
Zhengyan Fu
Publication date: 09/03/2022

We report here the use of simple and readily available alkyl sulfoxides as precursors to radicals and their application in the preparation of pyridine derivatives. We show that alkyl sulfoxides form EDA complexes with N-methoxy pyridinium salts, which, upon visible light irradiation, undergo a cascade of radical processes to afford pyridine derivatives smoothly. This method displays broad scope with respect to both reactants. The synthetic versatility of sulfoxides as a handle in chemistry adds to the power of this transformation. The method is further applied in the synthesis of various pyridyl C–glycosides that were previously difficult to access.

ChemRxiv

Robust traffic classification with mislabelled training samples

Author: Luo Wei
Wang Binfeng
Xia Dawen
Zhang Jun
Zhang Zili
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Traffic classification plays a significant role in network security and management. However, accurate classification is challenging if the training data is contaminated with unclean traffic. Recent research often assumes clean training data, and hence performance degrades on real-time network traffic. To meet this challenge, we propose a robust method, Unclean Traffic Classification (UTC), which incorporates noise elimination and suspected noise reweighting. First, UTC eliminates strongly noisy training data identified by consensus filtering with multiple classifiers. Furthermore, UTC estimates the relevance of the remaining training data and learns a robust traffic classifier. Through a number of experiments on a real-world traffic dataset, we show that the new method outperforms existing state-of-the-art traffic classification methods under the extremely difficult circumstance of unclean training data.
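
The consensus-filtering step named in the abstract above can be sketched as follows: a training sample is dropped only when every classifier in an ensemble disagrees with its given label. This is a simplified illustration, not UTC: the classifiers are assumed to be already-trained callables (in practice they would be trained on held-out folds), the reweighting stage is omitted, and `consensus_filter` is a hypothetical name.

```python
def consensus_filter(samples, labels, classifiers):
    """Keep (sample, label) pairs unless all classifiers reject the label.

    classifiers: callables mapping a sample to a predicted label
    (assumed to be trained elsewhere).
    """
    kept = []
    for x, y in zip(samples, labels):
        votes_against = sum(1 for clf in classifiers if clf(x) != y)
        if votes_against < len(classifiers):
            # At least one classifier agrees with the label, so keep it.
            kept.append((x, y))
    return kept
```

Requiring unanimous disagreement is the conservative choice: it removes fewer clean samples than majority voting, at the cost of leaving more mislabelled ones for a later reweighting stage to handle.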

Deakin Research Online

Convergence of an accelerated distributed optimisation algorithm over time‐varying directed networks

Author: Dawen Xia
Huaqing Li
Jing Guo
Jinhui Hu
Yu Yan
Zheng Wang
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2021

This article studies distributed optimisation over time‐varying directed networks, where a group of agents aims to cooperatively minimise a sum of local objective functions. Each agent uses only local computation and communication in the overall process without leaking its private information. By incorporating both a distributed heavy‐ball method and a distributed Nesterov method, a doubly accelerated distributed algorithm that leverages a gradient‐tracking technique and uses uncoordinated step‐sizes is developed. By employing both row‐ and column‐stochastic weight matrices, the proposed algorithm bypasses the implementation of doubly stochastic weight matrices and avoids the eigenvector estimation required by some algorithms that use only row‐ or column‐stochastic weight matrices. Under the assumptions that the agents' local objective functions are smooth and strongly convex, and that the aggregated directed networks of every finite consecutive directed network are strongly connected, the proposed algorithm is proved to converge linearly to the global optimal solution when the largest step‐size is positive and sufficiently small and the largest momentum parameter is non‐negative. The proposed algorithm is also applied to fixed directed networks, which are considered a special case of time‐varying directed networks. Simulation results further verify the effectiveness of the proposed algorithm and the correctness of the theoretical findings.
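
A much-reduced sketch of gradient tracking with row- and column-stochastic weights plus a heavy-ball term, in the spirit of the abstract above, is given below. It rests on several assumptions that differ from the paper: the network is fixed rather than time-varying, all agents share one step-size, only heavy-ball momentum is used (no Nesterov term), and each local objective is the scalar quadratic f_i(x) = 0.5 (x - c_i)^2, whose sum is minimised at the mean of the c_i. The function name and matrices are hypothetical.

```python
def heavy_ball_tracking(A, B, targets, alpha=0.05, beta=0.1, iterations=2000):
    """Each agent i minimises sum_j 0.5 * (x - targets[j])**2 cooperatively.

    A: row-stochastic matrix mixing the agents' estimates x.
    B: column-stochastic matrix mixing the gradient trackers y.
    """
    n = len(targets)

    def grad(i, x):
        # Gradient of the local objective f_i(x) = 0.5 * (x - targets[i])**2.
        return x - targets[i]

    x_prev = [0.0] * n
    x = [0.0] * n
    y = [grad(i, x[i]) for i in range(n)]  # trackers start at local gradients
    for _ in range(iterations):
        x_next = [
            sum(A[i][j] * x[j] for j in range(n))
            - alpha * y[i]
            + beta * (x[i] - x_prev[i])  # heavy-ball momentum
            for i in range(n)
        ]
        y = [
            sum(B[i][j] * y[j] for j in range(n)) + grad(i, x_next[i]) - grad(i, x[i])
            for i in range(n)
        ]
        x_prev, x = x, x_next
    return x


# Directed graph with edges 0->1, 1->2, 2->0, 0->2 (plus self-loops).
A = [[0.5, 0.0, 0.5], [0.5, 0.5, 0.0], [1 / 3, 1 / 3, 1 / 3]]  # rows sum to 1
B = [[1 / 3, 0.0, 0.5], [1 / 3, 0.5, 0.0], [1 / 3, 0.5, 0.5]]  # columns sum to 1
estimates = heavy_ball_tracking(A, B, [1.0, 2.0, 6.0])
```

Neither matrix is doubly stochastic, yet the tracker update preserves the sum of the `y` values as the sum of local gradients, which is what steers every agent toward the common minimiser.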

Directory of Open Access Journals